Abstract — There is substantial interest in the effect of 
human mobility patterns on opportunistic communica- 
tions. Inspired by recent work revisiting some of the early 
evidence for a Levy flight foraging strategy in animals, 
we analyse datasets on human contact from real world 
traces. By analysing the distribution of inter-contact times 
on different time scales and using different graphical 
forms, we find not only the highly skewed distributions 
of waiting times highlighted in previous studies but also 
a clear circadian rhythm. The relative visibility of these two 
components depends strongly on which graphical form is 
adopted and the range of time scales. We use a simple 
model to reconstruct the observed behaviour and discuss 
the implications of this for forwarding efficiency. 

I. Introduction 

Digital traffic flows not only over the wired backbone 
of the Internet or network of mobile phone masts, but 
also in small leaps through physical space as people 
pass one another on the street [14]. Thus opportunities 
for a new communication paradigm via wireless-enabled 
devices are emerging, which communicate directly with 
other devices within their range and without a costly and 
inflexible planned infrastructure (e.g., [9]). Improving 
communication efficiency and preventing the spread of 
wireless viruses in this new generation of communication 
requires new insights into, and quantitative models of, human 
interaction. Of fundamental importance in this case is the 
time sequence of human contacts, as well as other properties 
of complex networks, such as small-worldness, etc. 
(e.g., the special issue of Science on Complex Systems 
and Networks, July 24, 2009). 

Recently, the emergence of human interaction traces 
from online and pervasive environments is allowing us 
to understand details of human activities. For example, 
the MIT Reality Mining project [6] collected proximity, 
location and activity information, with nearby nodes 
being discovered through periodic Bluetooth scans and 
location information from cell tower IDs. Several other 
groups have performed similar studies. Some have used 



Bluetooth to measure device connectivity 0, O, [18], 
while others rely on WiFi Q3), GPS (H, [23, lfT51 . 
or the position of cell towers [10]. The duration of 
experiments has varied from 2 days to over one year, 
and the number of participants has varied from 
~10 to ~100,000. 

It has been suggested that the probability density 
function (pdf) $p(t)$ of times between human contact 
is well approximated by a truncated power law, i.e., 
$p(t) \sim t^{-(1+\alpha)}$ over some range. This is so whether 

the contact is by physical proximity (i.e., detectability of 
wireless access points or Bluetooth devices, or closeness 
of GPS locations BL fOll . ll22l ) or by telecommunica- 
tion (i.e., mobile phone call [10] or e-mail Ifl6l0 , and 
whether one or both contacting devices are in motion 
(e.g., both Bluetooth, one Bluetooth and fixed wireless 
access points, mobile phone and fixed masts). 

A summary is given in Table I of studies in which 
the stability exponent $\alpha$ has been inferred from an 
inter-contact time (ICT) distribution, together with the 
approximate range of applicability. From the quoted values, 
$\alpha$ is inferred to lie in the interval [≈0, 0.9], which is 
within the allowable range ($0 < \alpha < 2$) for the tails of a 
Levy (stable) distribution [20], IPT71 (except possibly for 
the marginal case of the Europe study of mobile phone 
contact which could actually be a gamma distribution.) 
Consequently it has been argued that human mobility 
patterns resemble truncated Levy walks (TLW). The 
TLW paradigm represents a development of the Levy 
flight, which was a random walk comprising steps drawn 
from a Levy distribution, rather than a Gaussian as 
occurs in the more familiar Brownian random walks 
[24]. The first modification, to a finite constant velocity, 
was dubbed a Levy walk. Subsequently the limitation 
to a finite domain was described as truncation [17]. 
More recently some researchers have also considered the 
velocity to be a variable (e.g., [23]). 
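
The relation between the exponents quoted for the tail distribution (ccdf) and for the pdf in Table I follows from integrating the density. As a brief sketch, assuming a pure power law density above a lower cut-off $t_{\min}$ (a symbol introduced here only for illustration) and $\alpha > 0$ so that the density can be normalised:

```latex
p(t) = \frac{\alpha\, t_{\min}^{\alpha}}{t^{1+\alpha}}, \quad t \ge t_{\min}
\qquad\Longrightarrow\qquad
P(T > t) = \int_{t}^{\infty} p(s)\,\mathrm{d}s
         = \left(\frac{t}{t_{\min}}\right)^{-\alpha}.
```

A slope of $-\alpha$ on a log-log plot of the ccdf therefore corresponds to a slope of $-(1+\alpha)$ on a log-binned pdf. This is why a quoted ccdf exponent of $-0.9$ maps to $\alpha = 0.9$, whereas a quoted pdf exponent of $-0.9$ maps to $\alpha = -0.1$.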

User population                      | Intel           | Cambridge 1     | INFOCOM 2005    | Toronto         | UCSD            | Dartmouth       | Europe
Source                               | [3]             | [3]             | [3]             | [3]             | [3]             | [3]             | [10]
Device                               | iMote           | iMote           | iMote           | PDA             | PDA             | Laptop/PDA      | Mobile phone
Network type                         | Bluetooth       | Bluetooth       | Bluetooth       | Bluetooth       | WiFi            | WiFi            | Mobile phone
Granularity                          | 120 s           | 120 s           | 120 s           | 120 s           | 120 s           | 300 s           | N/A
Duration                             | 3 days          | 5 days          | 3 days          | 16 days         | 77 days         | 114 days        | 6 months
Devices participating                | 8               | 12              | 41              | 23              | 273             | 6,648           | 100,000
Number of internal contacts          | 1,091           | 4,229           | 22,459          | 2,802           | 195,364         | 4,058,284       | 16,364,308
Approx. extent of power law region   | 4 min - 14 min  | 10 min - 30 min | 10 min - 10 h   | 2 min - 6 min   | 20 min - 1 day  | 10 min - 1 h    | 100 s - 8 h
Quoted power law exponent            | -0.9            | -0.9            | -0.4            | -0.9            | -0.3            | -0.3            | -0.9 +/- 0.1
Type of distribution plotted         | Tail df (ccdf)  | Tail df (ccdf)  | Tail df (ccdf)  | Tail df (ccdf)  | Tail df (ccdf)  | Tail df (ccdf)  | Log-binned pdf
Inferred stability exponent $\alpha$ | 0.9             | 0.9             | 0.4             | 0.9             | 0.3             | 0.3             | -0.1 +/- 0.1

Sources: [3] Chaintreau et al. (2006); [10] Gonzalez et al. (2009).

TABLE I: Summary of studies in which the stability exponent $\alpha$ has been inferred from an inter-contact 
time distribution 

Similar movement patterns have also been inferred 
for animals [25], and it has been proposed that Levy 
foraging is an optimal strategy under at least some 
circumstances [26]. Debate continues as to the extent to 
which a Levy strategy could be universal and insensitive 
to the details of the environment and of the physiology 
and motivation of the individual (e.g. (H, lt2~Tl . and refer- 
ences therein). However, the statistical analysis methods 
which have most frequently been used to infer empirical 
support for the truncated Levy walk hypothesis have 
recently been criticised, both in the ecology literature 
and more generally H, Q, 0, [27]]. Key problems 
identified have included: (1) The widespread inference 
of power law pdfs by the graphical method of straight 
line fitting to histograms with double logarithmic axes; 
(2) the difficulty of inferring power laws over very 
limited ranges; (3) the use of intrinsically biased methods 
(such as (1)) for estimating the power law exponent; 
and, perhaps most importantly, (4) inadequate, or even a 
complete lack of, alternative hypotheses. 

In the light of this, it is worthwhile to consider 
how these problems might apply to the human mobility 
studies cited above and summarised in Table I. For 
example, some simply compare their distributions with 
a straight line on a log-log plot, with unavoidable bias 
and spread in the inferred power law exponent [3], 
[13]. In addition, referring to Table I, the inference 
of a possible power law region is very weak for the 
Intel, Cambridge 1 and Toronto experiments because the 
region is so limited (~ 1/3 decade), presumably related 
to the small samples (~ 1000 contacts). The evidence 
is more convincing for the larger samples (INFOCOM 
2005, UCSD, Dartmouth and Europe) with wider ap- 
parent power law regions. Alternative hypotheses to the 
pure power law null model have been considered, such 
as the exponentially-truncated power law fOl . 1101 . but 
only one study fOl has actually fitted and quantitatively 
compared several alternative models to ICT distributions 
(albeit simulated), using the less biased maximum likeli- 
hood estimate to infer the model parameters such as the 
power law exponent and Akaike weights fH to compare 
the goodness of fits. Thus, at present, the inference of 
a truncated power law ICT distribution directly from 
experiment is limited. 
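
As an illustration of the less biased alternative to straight-line fitting, the following is a minimal sketch of the maximum likelihood estimate of the exponent for a pure (untruncated) continuous power law $p(x) \sim x^{-(1+\alpha)}$, $x \ge x_{\min}$. The function names `mle_alpha` and `sample_pareto` and the choice of `x_min` are assumptions made here for illustration; this is not the procedure of any of the cited studies, and it ignores truncation, binning and model comparison.

```python
import math
import random


def mle_alpha(samples, x_min):
    """Maximum likelihood estimate of alpha for p(x) ~ x^-(1+alpha), x >= x_min.

    Uses the closed-form estimator alpha_hat = n / sum(ln(x_i / x_min)) and
    returns (alpha_hat, approximate standard error alpha_hat / sqrt(n)).
    """
    tail = [x for x in samples if x >= x_min]
    n = len(tail)
    if n == 0:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all samples equal x_min; exponent is undefined")
    alpha_hat = n / log_sum
    return alpha_hat, alpha_hat / math.sqrt(n)


def sample_pareto(n, alpha, x_min, rng):
    """Draw n Pareto samples by inverting the tail distribution (x/x_min)^-alpha."""
    # 1 - rng.random() lies in (0, 1], which avoids raising zero to a negative power.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_pareto(20000, alpha=0.4, x_min=100.0, rng=rng)
    alpha_hat, std_err = mle_alpha(data, x_min=100.0)
    print(f"alpha_hat = {alpha_hat:.3f} +/- {std_err:.3f}")
```

Unlike a line fitted to a log-log histogram, this estimator does not depend on a choice of bins, although it still requires a choice of $x_{\min}$ and says nothing about whether a power law is the best model.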
Indeed, it would be surprising if a truncated power law 
was a complete description of human ICT considering 
our prior knowledge about the social habits and struc- 
tures of humans, such as the working day and family 
and community responsibilities [10]. In fact it has been 
recognised that the ICT distribution is not stationary and 
changes with the time of day [13]. Spatial movement 
distributions also exhibit daily patterns [10] and Fourier 
analyses of proximity edges have daily and weekly 
periodicities [6]. Similarly a fundamental semi-diurnal 
periodicity was identified in an early study claiming a 
Levy strategy for animal foraging [25]. This suggests that 
alternative models combining non-trivial randomness 
and periodic rhythms should be investigated. At present 
such more complicated models are challenging to test 
rigorously (e.g., by MLE) but progress can nevertheless 
be made by closer examination of the experimental 
ICT distribution using different graphical methods and 
modelling. 

In this paper we consider three similar human contact 
experiments of varying durations (section 2). We analyse 
and model them to identify regularities that modify 
the underlying Levy walk behaviour (section 3). Then 
we briefly compare these analyses with others in the 
literature and discuss how this hybrid behaviour may 
be modelled and will affect the efficiency of ad-hoc 
communication (section 4). 

II. Datasets 

We analyse trace data from the Haggle project O 
and Crawdad database O, collected using Bluetooth 
communication in a conference environment and two 
university study environments. The configuration of 
data collection is summarised in Table II. 

MIT: in the MIT Reality Mining project [6], 100 
smart phones were deployed to students and staff at 
MIT over a period of 9 months. These phones were 
running software that logged contacts. 

Cambridge 2: in the Cambridge Haggle project [9], 
36 iMotes (Intel Mote ISN100-BA) were deployed to 
1st year and 2nd year undergraduate students for 11 
days to detect proximity using Bluetooth. The iMote 
runs TinyOS and is equipped with an ARM7TDMI 
processor operating at 12 MHz, with 64 kB of SRAM, 
512 kB of flash storage, a multi-coloured LED, and 
a Bluetooth 1.1 radio with a range of around 
30 metres. 

INFOCOM 2006: also in the Cambridge Haggle 
project, 77 iMotes were deployed at the INFOCOM 
2006 conference for 3 days. 

Experimental data set          | MIT         | Cambridge 2 | INFOCOM 2006
Device                         | Phone       | iMote       | iMote
Network type                   | Bluetooth   | Bluetooth   | Bluetooth
Duration $L$ (days)            | 246         | 11          | 3
Granularity $\Delta$ (seconds) | 600         | 100         | 100
Number of devices              | 97          | 36          | 77
Number of contacts             | 54,667      | 10,873      | 191,336
Average # contacts / day       | 0.024       | 0.345       | 6.7

TABLE II: Characteristics of the experiments 

The logged data from the above experimental studies 
are used to build time-dependent network information, 
from which we study the distribution of contact times, 
inter-contact times and community structure, and their 
statistical properties. We constructed discrete event traces 
of pairwise interactions at intervals of 10 to 600 seconds, 
and aggregated the raw data within 100 or 600 second time 
windows to reduce the uncertainty in device detection that 
arises from the complex Bluetooth communication protocol. 

A complex operation is required to collect accurate 
connectivity traces using Bluetooth communication, 
as the device discovery protocol may limit detection 
of the devices in radio proximity. Bluetooth uses a 
special physical channel for devices to discover each 
other. A device becomes discoverable by entering the 
inquiry scan substate, where it can respond to inquiries 
from other devices. The inquiry substate is used 
to discover surrounding devices. The discovering 
device iterates (hops) through all possible inquiry scan 
channel frequencies in a pseudo-random fashion. For 
each frequency, it broadcasts an inquiry and listens 
for responses. Therefore, a Bluetooth device cannot 
scan for other devices while it is itself 
discoverable. Bluetooth inquiry can only happen in 
1.28 second intervals. It is reported that an interval of 
4 × 1.28 = 5.12 seconds gives a more than 90% chance 
of finding a device. However, there is no available data 
for situations where many devices are present, and no 
precise study has been reported. The Bluetooth standard 
recommends being in the inquiry substate for 10.24 
seconds in order to collect all responses in an error-free 
environment. A 10.24 second alternation may cause 
missing links, and we therefore deploy 5.12 seconds 
for inquiry. The power consumption of Bluetooth is 
also a critical limitation for the scanning interval. The 
iMote connectivity traces in Haggle [9] use a scanning 
interval of approximately 2 minutes, while the Reality 
Mining project at MIT [6], with cell phones, uses 5 
minutes. On average, only around 15% - 20% of the 
devices in the population have Bluetooth enabled. 
The range of Bluetooth varies between 10 m and 80 m, 
depending on the device class, such as cell phones or 
laptops. In cell phones, the Bluetooth range is usually 
5 - 10 m. We have observed that the devices can be 
detected within a 20 m range if there are no obstacles, 
while with obstacles such as a thick wall the range 
drops to 5 m (see more detail in [18], [19]). 

Fig. 1: INFOCOM 2006: (a) Rank order plot, (b) pdf, (c) semilog histogram with linear bins, and (d) loglog 
histogram for times less than 12 hours, with linear bins 

III. Rhythm and randomness 

In each of the experiments we calculated all possible 
inter-contact times $T$ between any two nodes, where the 
ICT is defined as the time between the end of a contact 
between two nodes and the start of the next contact between 
the same two nodes. Figures 1-3 summarise the ICT 
distribution for the three experiments. In each case, 
the distribution is plotted as: 

(a) a rank order plot with double logarithmic axes; 
(b) a probability density function with logarithmic 
ordinate (probability density) axis and logarithmic 
abscissa (inter-contact time) axis, using exponentially 
spaced bins (i.e., equal bin width in logarithm 
space = 0.1 decade); 
(c) a histogram with logarithmic ordinate (frequency) 
axis and linear abscissa (inter-contact time) axis, using 
100 equally-spaced bins (equivalent to a pdf with 
linearly spaced bins to within a constant); and 
(d) a histogram for inter-contact times up to 12 hours 
with logarithmic ordinate (frequency) axis and 
logarithmic abscissa (inter-contact time) axis, using 
equal bin widths at the granularity $\Delta$ = 100 s or 600 s. 

(The inset in Figure 3c shows a double logarithmic 
histogram using equal bin widths of 1800 s.) 

Fig. 2: Cambridge 2: (a) Rank order plot, (b) pdf, (c) semilog histogram with linear bins, and (d) loglog 
histogram for times less than 12 hours, with linear bins 
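
A minimal sketch of how the inter-contact times, the rank order data of panel (a) and the log-binned probability density of panel (b) might be computed is given below. The record layout `(node_a, node_b, start, end)` and the function names are assumptions made for illustration and do not describe the format of the Haggle or Reality Mining traces; only the ICT definition and the 0.1 decade bin width are taken from the text.

```python
import math
from collections import defaultdict


def inter_contact_times(contacts):
    """Return all ICTs given (node_a, node_b, start, end) contact records.

    An ICT is the time from the end of one contact between a pair of nodes
    to the start of the next contact between the same pair.
    """
    by_pair = defaultdict(list)
    for a, b, start, end in contacts:
        by_pair[frozenset((a, b))].append((start, end))
    icts = []
    for intervals in by_pair.values():
        intervals.sort()
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:]):
            if next_start > prev_end:  # skip overlapping records
                icts.append(next_start - prev_end)
    return icts


def log_binned_pdf(values, bins_per_decade=10):
    """Estimate a pdf using bins of equal width in log10 space.

    Returns a list of (geometric bin centre, density), where density is
    count / (n * linear bin width), so that the densities integrate to one.
    """
    values = [v for v in values if v > 0]
    n = len(values)
    counts = defaultdict(int)
    for v in values:
        counts[math.floor(math.log10(v) * bins_per_decade)] += 1
    pdf = []
    for k in sorted(counts):
        lo = 10 ** (k / bins_per_decade)
        hi = 10 ** ((k + 1) / bins_per_decade)
        pdf.append((math.sqrt(lo * hi), counts[k] / (n * (hi - lo))))
    return pdf


def rank_order(values):
    """Return (value, rank) pairs with rank 1 assigned to the largest value."""
    return [(v, r) for r, v in enumerate(sorted(values, reverse=True), start=1)]


if __name__ == "__main__":
    # Hypothetical toy trace; times are in seconds.
    trace = [("a", "b", 0, 100), ("b", "a", 400, 500), ("a", "c", 50, 150),
             ("a", "b", 1500, 1600)]
    icts = inter_contact_times(trace)
    print(sorted(icts))
    print(log_binned_pdf(icts))
    print(rank_order(icts))
```

Dividing each count by the linear width of its bin is what distinguishes the log-binned pdf from a plain histogram on logarithmic axes; without that normalisation the apparent slope changes by one.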

A. Truncated power law distribution 

Considering the rank order plots in Figures 1-3, we 
might suggest, as others have done, that the ICT tail 
distribution of all three experiments roughly resembles a 
restricted range power law with exponent < 1 (cf. Figures 
1 and 2 of [3]). To illustrate this, we performed the 
following simulation: 

1) A set of contact times is calculated for Levy 
walks in a domain bounded by the duration of 
the experiment $L$ (see Table II). Specifically we 
calculate the cumulative sum, $t_i = \sum_{j=1}^{i} X_j$, 
where $X$ is a set of $N$ iid samples chosen from 
the Pareto distribution with pdf $p(x) \sim x^{-(1+\alpha)}$ 
in the range $\Delta$ to $100L$. The samples are generated 
by picking iid samples $C_j$ from the uniform 
distribution in the range (0,1] and then inverting 
the analytical equation for the Pareto cumulative 
probability distribution to find the value $x = X_j$ 
that yields the value $C = C_j$. 

2) Divide the contact times into individual trials (i.e., 
trial number = $t_i$ modulo $L$). 

3) Calculate the set of inter-contact times $T$ from 
the time differences between neighbouring contact 
times, $T_i = t_{i+1} - t_i$, omitting inter-contact times 
that straddle trials. 

Fig. 3: MIT: (a) Rank order plot, (b) pdf, (c) semilog histogram with linear bins (inset shows loglog histogram 
with linear bins), and (d) loglog histogram for times less than 12 hours, with linear bins 

Figure 4a shows a simulated ICT probability 
distribution (solid line) choosing $\alpha = 0.4$ and other 
parameters corresponding to the configuration of the 
INFOCOM 2006 experiment: $\Delta$ = 100 s, $L$ = 3 days, 
and $N$ = 10,000. It is clear that the simulated 
distribution is only a crude approximation to the actual 
INFOCOM 2006 distribution (dashed line) and that other 
structure is evident. 
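
The three simulation steps above can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' code: the function names are hypothetical, the trial index is interpreted as the integer part of $t_i / L$, and the upper limit of $100L$ is imposed by inverting the cumulative distribution of a Pareto law truncated at that value.

```python
import random


def truncated_pareto(n, alpha, x_min, x_max, rng):
    """Draw n iid samples from p(x) ~ x^-(1+alpha) on [x_min, x_max]
    by inverting the cumulative distribution at uniform variates."""
    lo, hi = x_min ** -alpha, x_max ** -alpha
    samples = []
    for _ in range(n):
        c = 1.0 - rng.random()  # uniform on (0, 1]
        samples.append((lo - c * (lo - hi)) ** (-1.0 / alpha))
    return samples


def simulate_icts(alpha, delta, duration, n, rng, range_factor=100):
    """Simulate ICTs for a walk bounded only by the experiment duration."""
    # Step 1: contact times are the cumulative sum of the Pareto samples.
    waits = truncated_pareto(n, alpha, delta, range_factor * duration, rng)
    times, t = [], 0.0
    for x in waits:
        t += x
        times.append(t)
    # Step 2: assign each contact time to a trial of length `duration`
    # (assumed interpretation: trial index = integer part of t / duration).
    trials = [int(t // duration) for t in times]
    # Step 3: differences between neighbouring contact times,
    # omitting those whose end points fall in different trials.
    return [t1 - t0
            for t0, t1, k0, k1 in zip(times, times[1:], trials, trials[1:])
            if k0 == k1]


if __name__ == "__main__":
    day = 24 * 3600
    # Parameters quoted in the text for the INFOCOM 2006 configuration.
    icts = simulate_icts(alpha=0.4, delta=100.0, duration=3 * day,
                         n=10_000, rng=random.Random(1))
    print(len(icts), min(icts), max(icts))
```

Because inter-contact times that straddle trials are discarded, no simulated ICT can exceed the experiment duration, which is the artificial truncation visible in the simulated distribution.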

B. Circadian rhythm 

Additional structure is also obvious in the other 
experiments (e.g., the histograms in Figures 1c, 2c and 
3c), where there are significant deviations about any 
candidate monotonic function. Closer inspection reveals 
much of this deviation to be associated with a circadian 
rhythm, as evidenced by the alignment of peaks in the 
histogram/pdf at integer multiples of 24 hours. (Note 
also a weekly rhythm in Figure 3.) 

Nevertheless, the INFOCOM 2006 and MIT 
distributions are well approximated by a power law on 
time scales much less than a day (e.g., < 12 hours, 
see Figures 1d and 3d). (This is less obvious in the 
Cambridge 2 experiment (Figure 2d) due to a ≈ 10 min 
periodicity which is likely an experimental artefact.) 
This suggests that a better null model of ICT in these 
experiments is a Levy walk in a periodic domain. To 
investigate this we performed the following simulation: 



Fig. 4: Comparison of Levy flight simulation of inter-contact times without (left), and with (right), the 
presence of circadian periodicity. 



1) A set of contact times is calculated for Levy walks 
in a domain bounded by the duration of the experiment 
$L$ (see Table II). Specifically we calculate the 
cumulative sum, $t_i = \sum_{j=1}^{i} X_j$, where $X$ is a set 
of $N$ iid samples chosen from the Pareto distribution 
with pdf $p(x) \sim x^{-(1+\alpha)}$ in the range $\Delta$ to $L$. 
The samples are generated by picking iid samples 
$C_j$ from the uniform distribution in the range (0,1] 
and then inverting the analytical equation for the 
Pareto cumulative probability distribution to find 
the value $x = X_j$ that yields the value $C = C_j$. 

2) Divide the contact times into days and retain 
only contact times that fall within a working day, 
defined to start at $t_s$ h and end at $t_e$ h (i.e., 
$t_s < d_i = t_i$ modulo 24 hours $< t_e$). 

3) Divide the contact times into individual trials (i.e., 
trial number = $t_i$ modulo $L$). 

4) Calculate the set of inter-contact times $T$ from 
the time differences between neighbouring contact 
times, $T_i = t_{i+1} - t_i$, omitting inter-contact times 
that straddle trials. 



Figure 4b shows a simulated ICT probability distribution 
(solid line) choosing $\alpha = 0.4$ and other parameters 
corresponding to the configuration of the INFOCOM 2006 
experiment: $\Delta$ = 100 s, $L$ = 3 days, and $N$ = 10,000. 
The simulated distribution compares favourably with 
the actual INFOCOM 2006 distribution (dashed line), 
supporting this null model over a Levy walk that is confined 
within the domain $L$ but not within the working day. 
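
A corresponding sketch of the periodic-domain simulation is given below. As before, it is an illustrative interpretation and not the authors' implementation. The working-day limits `t_start_h = 9` and `t_end_h = 17` are placeholder assumptions, since the values of $t_s$ and $t_e$ are not stated in the text, and the helper names are hypothetical.

```python
import random

DAY = 24 * 3600.0  # seconds


def truncated_pareto(n, alpha, x_min, x_max, rng):
    """Draw n iid samples from p(x) ~ x^-(1+alpha) on [x_min, x_max]."""
    lo, hi = x_min ** -alpha, x_max ** -alpha
    # 1 - rng.random() is uniform on (0, 1].
    return [(lo - (1.0 - rng.random()) * (lo - hi)) ** (-1.0 / alpha)
            for _ in range(n)]


def simulate_periodic_icts(alpha, delta, duration, n, rng,
                           t_start_h=9.0, t_end_h=17.0):
    """Simulate ICTs when contacts are only retained within a working day."""
    # Step 1: cumulative sum of Pareto samples in the range delta to duration.
    times, t = [], 0.0
    for x in truncated_pareto(n, alpha, delta, duration, rng):
        t += x
        times.append(t)
    # Step 2: retain contact times with t_s < (t modulo 24 hours) < t_e.
    t_s, t_e = t_start_h * 3600.0, t_end_h * 3600.0
    kept = [t for t in times if t_s < t % DAY < t_e]
    # Steps 3 and 4: differences between neighbouring retained contact
    # times, omitting pairs that fall in different trials of length duration.
    return [t1 - t0 for t0, t1 in zip(kept, kept[1:])
            if t0 // duration == t1 // duration]


if __name__ == "__main__":
    icts = simulate_periodic_icts(alpha=0.4, delta=100.0, duration=3 * DAY,
                                  n=10_000, rng=random.Random(1))
    same_day = sum(1 for i in icts if i < 8 * 3600)
    print(len(icts), same_day, len(icts) - same_day)
```

With these placeholder limits, two retained contacts on the same day are less than 8 hours apart and two on different days are more than 16 hours apart. The sketch therefore produces a truncated short-time component followed by a gap and then recurrences spaced by 24 hours, which is qualitatively the structure described for Figure 4b.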



IV. Conclusions and Implications 

The distribution of human inter-contact times from three 
experiments of differing durations has been analysed 
using different graphical presentations. This has revealed 
three essential properties of human contact: 

Random, scale-free. On sufficiently short time 
scales, the ICT distribution is approximated by a power 
law consistent with the return times of a Levy flight. 
The value of the stability exponent ($\alpha < 1$) implies no 
characteristic ICT in the absence of other constraints. 

Truncated. At some time scale the power law 
component is truncated by a constraint on inter- 
contact time. One artificial constraint is the experiment 
itself which prohibits recording ICTs longer than 
the experiment duration. This is demonstrated in the 
simulated ICT distribution in Figure 4a and should 
be considered in comparing results from experiments 
of differing durations. More significantly, another 
constraint is the removal of agents from the contact 
domain. An example of this is movement from work 
to home which suppresses ICTs between agents in 
the same work group on time scales beyond the 
working day. This is demonstrated in the simulated ICT 
distribution in Figure 4b by the truncation of the power 
law component at ICT $\sim 10^4$ s. 

Periodic. Environmental, biological, and social 
constraints may have rhythms that encourage repeated 
encounters such as the daily to-ing and fro-ing 
between work and home. This is demonstrated in the 
simulated ICT distribution in Figure 4b by the peaks at 
ICT $\sim 6 \times 10^4$ s and $\sim 15 \times 10^4$ s (i.e., separated by 
24 hours). 

These three properties have been previously surmised 
by various different means but evidence of their co- 
existence in the ICT distribution has been overlooked. 
In particular, closer examination of previously published 
ICT distributions (e.g., [10]) reveals deviations about a 
truncated power law consistent with a circadian rhythm. 
Recognition of this rhythm in the empirical distribution 
is important otherwise models of human movement and 
behaviour may be unrealistically modified to generate 
only the scale-free property (e.g., [2]). It also has 
significant implications for building efficient routing 
algorithms and functionality on top of opportunistic 
networks. As a very simple example, clearly a rhythm 
of period P that removes agents from each other for a 
time P/2 reduces the average number of contacts by 
50% over multiple cycles. But its determinism might 
also be exploited to increase communication efficiency. 
For example, the time of the next encounter could be 
estimated at the node and thus selection of the next hop 
could be determined based on the expected shortest time 
to the next encounter. The periodic behaviour of nodes 
could indicate moving from one network partition to 
another and this could be used for temporal clustering 
of nodes, where temporal-based communities could 
be used as a backbone of logical network structure 
for forwarding [12]. By these means, mobility-assisted 
forwarding can take advantage of patterns arising 
in the distribution of nodes in time and space. One 
alternative movement model is suggested that combines 
the Levy walk model with models such as the Home 
Cell Mobility Model (e.g., [2]) that incorporate the 
influence of social structure. However the development 
of more complicated models will also present challenges 
in testing them and distinguishing between competing 
models. 
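
As a hedged illustration of how the determinism of a circadian rhythm might be exploited for next-hop selection, the sketch below estimates the wait until the next encounter from the times of day of past contacts and selects the relay expected to meet the destination soonest. The estimator, the data layout and the function names are hypothetical and are not taken from any cited forwarding scheme; the sketch assumes that past contact phases simply repeat every 24 hours.

```python
DAY = 24 * 3600  # seconds


def expected_wait(now, past_contacts, period=DAY):
    """Crude estimate of the wait until the next encounter, assuming past
    contact phases (time modulo the period) repeat every period."""
    if not past_contacts:
        return float("inf")
    phase_now = now % period
    return min((t % period - phase_now) % period for t in past_contacts)


def choose_next_hop(now, history_with_destination):
    """Pick the candidate relay expected to meet the destination soonest.

    `history_with_destination` maps each candidate relay to the list of past
    times at which that relay was in contact with the destination.
    """
    return min(history_with_destination,
               key=lambda relay: expected_wait(now, history_with_destination[relay]))


if __name__ == "__main__":
    # Hypothetical histories: relay r1 met the destination at 09:00,
    # relay r2 at 14:00, both on day 0.
    history = {"r1": [9 * 3600], "r2": [14 * 3600]}
    print(choose_next_hop(3 * DAY + 10 * 3600, history))
    print(choose_next_hop(3 * DAY + 8 * 3600, history))
```

A purely periodic estimator of this kind ignores the random, scale-free component of the ICT distribution, so in practice it would need to be combined with a model of the short-time behaviour.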